Overview


Dataset statistics

                                 Training Data   Original Data
Number of variables              12              12
Number of observations           593994          20000
Missing cells                    0               0
Missing cells (%)                0.0%            0.0%
Duplicate rows                   0               0
Duplicate rows (%)               0.0%            0.0%
Total size in memory             54.4 MiB        1.8 MiB
Average record size in memory    96.0 B          96.0 B
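
As a quick sanity check on the figures above, the average record size can be approximated by dividing the total memory by the number of observations. The sketch below does that arithmetic in plain Python. The helper name average_record_size is illustrative and is not part of ydata-profiling, and because the reported totals are rounded to one decimal place, the results only approximate the 96.0 B shown.

```python
# Figures copied from the Dataset statistics table.
MIB = 1024 ** 2  # bytes per mebibyte

datasets = {
    "Training Data": {"rows": 593_994, "total_mib": 54.4},
    "Original Data": {"rows": 20_000, "total_mib": 1.8},
}


def average_record_size(total_mib: float, rows: int) -> float:
    """Return the approximate in-memory size of one row, in bytes."""
    return total_mib * MIB / rows


for name, d in datasets.items():
    size = average_record_size(d["total_mib"], d["rows"])
    print(f"{name}: about {size:.1f} B per row")
```

A value near 96 B would be consistent with 12 columns of 8 bytes each, but that per-column width is an assumption and is not stated in the report.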

Variable types

               Training Data   Original Data
Numeric        5               5
Categorical    7               7

Alerts

The following alerts are raised for both Training Data and Original Data:
High correlation: credit_score is highly overall correlated with grade_subgrade and 1 other field
High correlation: employment_status is highly overall correlated with loan_paid_back
High correlation: grade_subgrade is highly overall correlated with credit_score
High correlation: interest_rate is highly overall correlated with credit_score
High correlation: loan_paid_back is highly overall correlated with employment_status

Reproduction

                          Training Data                 Original Data
Analysis started          2025-11-14 17:00:20.786996    2025-11-14 17:00:51.510318
Analysis finished         2025-11-14 17:00:38.380925    2025-11-14 17:00:56.083976
Duration                  17.59 seconds                 4.57 seconds
Software version          ydata-profiling v4.17.0       ydata-profiling v4.17.0
Download configuration    config.json                   config.json

Variables

annual_income
Real number (ℝ)

                 Training Data   Original Data
Distinct         119728          19947
Distinct (%)     20.2%           99.7%
Missing          0               0
Missing (%)      0.0%            0.0%
Infinite         0               0
Infinite (%)     0.0%            0.0%
Mean             48212.203       43549.638
Minimum          6002.43         6000
Maximum          393381.74       400000
Zeros            0               0
Zeros (%)        0.0%            0.0%
Negative         0               0
Negative (%)     0.0%            0.0%
Memory size      4.5 MiB         156.4 KiB

Quantile statistics

                             Training Data   Original Data
Minimum                      6002.43         6000
5-th percentile              15450.11        13377.715
Q1                           27934.4         24260.753
Median                       46557.68        36585.26
Q3                           60981.32        54677.917
95-th percentile             93534.68        97663.691
Maximum                      393381.74       400000
Range                        387379.31       394000
Interquartile range (IQR)    33046.92        30417.165
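
For readers who want to reproduce this kind of quantile summary, here is a minimal sketch using only the Python standard library. The sample values are hypothetical and not taken from the profiled datasets. The report does not state which interpolation rule it uses, so the choice of method="inclusive" is an assumption.

```python
import statistics


def quantile_summary(values):
    """Return minimum, quartiles, maximum, range and IQR for a list of numbers."""
    data = sorted(values)
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "minimum": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "maximum": data[-1],
        "range": data[-1] - data[0],
        "iqr": q3 - q1,
    }


# Hypothetical sample, for illustration only.
sample = [10, 20, 30, 40, 50]
summary = quantile_summary(sample)
print(summary)
```

The range is the maximum minus the minimum, and the IQR is Q3 minus Q1, which is how the last two rows of the table relate to the rows above them.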

Descriptive statistics

                                   Training Data        Original Data
Standard deviation                 26711.942            28668.58
Coefficient of variation (CV)      0.5540494            0.65829663
Kurtosis                           7.0914126            10.952957
Mean                               48212.203            43549.638
Median Absolute Deviation (MAD)    17068.9              14200.04
Skewness                           1.7195087            2.3307532
Sum                                2.8637759 × 10^10    8.7099276 × 10^8
Variance                           7.1352785 × 10^8     8.2188746 × 10^8
Monotonicity                       Not monotonic        Not monotonic
Histogram with fixed size bins (bins=50)
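
To make the "fixed size bins" idea concrete, the following sketch splits the annual_income range of the Training Data into 50 equal-width bins and assigns a value to a bin. The function names are hypothetical, and the convention of placing the maximum in the last bin is an assumption rather than something the report states.

```python
def fixed_width_edges(minimum, maximum, bins=50):
    """Return the bins + 1 edges of equal-width bins spanning [minimum, maximum]."""
    width = (maximum - minimum) / bins
    return [minimum + i * width for i in range(bins + 1)]


def bin_index(value, minimum, maximum, bins=50):
    """Return the zero-based bin that contains value (maximum goes in the last bin)."""
    if value >= maximum:
        return bins - 1
    width = (maximum - minimum) / bins
    return int((value - minimum) / width)


# Minimum and maximum of annual_income in the Training Data, as reported.
lo, hi = 6002.43, 393381.74
edges = fixed_width_edges(lo, hi)
print(f"bin width: {edges[1] - edges[0]:.4f}")
print(f"the mean 48212.203 falls in bin {bin_index(48212.203, lo, hi)}")
```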
Training Data
Value                   Count     Frequency (%)
51351.71                238       < 0.1%
25499.88                227       < 0.1%
24113.12                219       < 0.1%
56547.75                209       < 0.1%
26386.33                187       < 0.1%
28991.07                185       < 0.1%
16077.08                170       < 0.1%
46949.29                160       < 0.1%
53981.9                 152       < 0.1%
52628.69                146       < 0.1%
Other values (119718)   592101    99.7%

Original Data
Value                   Count     Frequency (%)
6000                    26        0.1%
28316.41                2         < 0.1%
26386.33                2         < 0.1%
25860.67                2         < 0.1%
18205.78                2         < 0.1%
40010.06                2         < 0.1%
16664.34                2         < 0.1%
17306.58                2         < 0.1%
56547.75                2         < 0.1%
36822.03                2         < 0.1%
Other values (19937)    19956     99.8%
ValueCountFrequency (%)
6002.431
 
< 0.1%
6008.561
 
< 0.1%
6026.313
< 0.1%
6026.471
 
< 0.1%
6026.711
 
< 0.1%
6064.781
 
< 0.1%
6071.691
 
< 0.1%
6073.151
 
< 0.1%
6074.921
 
< 0.1%
6093.551
 
< 0.1%
ValueCountFrequency (%)
600026
0.1%
6002.431
 
< 0.1%
6008.561
 
< 0.1%
6018.91
 
< 0.1%
6026.311
 
< 0.1%
6100.321
 
< 0.1%
6105.991
 
< 0.1%
6109.871
 
< 0.1%
6151.161
 
< 0.1%
6166.421
 
< 0.1%
ValueCountFrequency (%)
600026
< 0.1%
6002.431
 
< 0.1%
6008.561
 
< 0.1%
6018.91
 
< 0.1%
6026.311
 
< 0.1%
6100.321
 
< 0.1%
6105.991
 
< 0.1%
6109.871
 
< 0.1%
6151.161
 
< 0.1%
6166.421
 
< 0.1%
ValueCountFrequency (%)
6002.431
 
< 0.1%
6008.561
 
< 0.1%
6026.313
< 0.1%
6026.471
 
< 0.1%
6026.711
 
< 0.1%
6064.781
 
< 0.1%
6071.691
 
< 0.1%
6073.151
 
< 0.1%
6074.921
 
< 0.1%
6093.551
 
< 0.1%

debt_to_income_ratio
Real number (ℝ)

                 Training Data   Original Data
Distinct         526             555
Distinct (%)     0.1%            2.8%
Missing          0               0
Missing (%)      0.0%            0.0%
Infinite         0               0
Infinite (%)     0.0%            0.0%
Mean             0.12069589      0.1770193
Minimum          0.011           0.01
Maximum          0.627           0.667
Zeros            0               0
Zeros (%)        0.0%            0.0%
Negative         0               0
Negative (%)     0.0%            0.0%
Memory size      4.5 MiB         156.4 KiB

Quantile statistics

                             Training Data   Original Data
Minimum                      0.011           0.01
5-th percentile              0.046           0.037
Q1                           0.072           0.096
Median                       0.096           0.16
Q3                           0.156           0.241
95-th percentile             0.259           0.376
Maximum                      0.627           0.667
Range                        0.616           0.657
Interquartile range (IQR)    0.084           0.145

Descriptive statistics

                                   Training Data    Original Data
Standard deviation                 0.068573259      0.10505934
Coefficient of variation (CV)      0.56814907       0.59349087
Kurtosis                           2.33523          0.36506797
Mean                               0.12069589       0.1770193
Median Absolute Deviation (MAD)    0.032            0.07
Skewness                           1.4066799        0.78773924
Sum                                71692.635        3540.386
Variance                           0.0047022918     0.011037464
Monotonicity                       Not monotonic    Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0.0911440
 
1.9%
0.09311160
 
1.9%
0.0979508
 
1.6%
0.0799099
 
1.5%
0.0948976
 
1.5%
0.0988647
 
1.5%
0.0718192
 
1.4%
0.0967715
 
1.3%
0.0637579
 
1.3%
0.0677373
 
1.2%
Other values (516)504305
84.9%
ValueCountFrequency (%)
0.093101
 
0.5%
0.0996
 
0.5%
0.11692
 
0.5%
0.09791
 
0.5%
0.1191
 
0.5%
0.07990
 
0.4%
0.1390
 
0.4%
0.09489
 
0.4%
0.06389
 
0.4%
0.1288
 
0.4%
Other values (545)19083
95.4%
ValueCountFrequency (%)
0.011169
< 0.1%
0.01255
 
< 0.1%
0.013127
< 0.1%
0.014243
< 0.1%
0.015138
< 0.1%
0.01680
 
< 0.1%
0.017205
< 0.1%
0.018186
< 0.1%
0.01961
 
< 0.1%
0.02152
< 0.1%
ValueCountFrequency (%)
0.0184
0.4%
0.01126
 
0.1%
0.01215
 
0.1%
0.01321
 
0.1%
0.01429
 
0.1%
0.01526
 
0.1%
0.01624
 
0.1%
0.01729
 
0.1%
0.01826
 
0.1%
0.01919
 
0.1%
ValueCountFrequency (%)
0.0184
< 0.1%
0.01126
 
< 0.1%
0.01215
 
< 0.1%
0.01321
 
< 0.1%
0.01429
 
< 0.1%
0.01526
 
< 0.1%
0.01624
 
< 0.1%
0.01729
 
< 0.1%
0.01826
 
< 0.1%
0.01919
 
< 0.1%
ValueCountFrequency (%)
0.011169
0.8%
0.01255
 
0.3%
0.013127
0.6%
0.014243
1.2%
0.015138
0.7%
0.01680
 
0.4%
0.017205
1.0%
0.018186
0.9%
0.01961
 
0.3%
0.02152
0.8%

credit_score
Real number (ℝ)

                 Training Data   Original Data
Distinct         399             399
Distinct (%)     0.1%            2.0%
Missing          0               0
Missing (%)      0.0%            0.0%
Infinite         0               0
Infinite (%)     0.0%            0.0%
Mean             680.91601       679.25695
Minimum          395             373
Maximum          849             850
Zeros            0               0
Zeros (%)        0.0%            0.0%
Negative         0               0
Negative (%)     0.0%            0.0%
Memory size      4.5 MiB         156.4 KiB

Quantile statistics

                             Training Data   Original Data
Minimum                      395             373
5-th percentile              582             565
Q1                           646             632
Median                       682             680
Q3                           719             727
95-th percentile             767             794
Maximum                      849             850
Range                        454             477
Interquartile range (IQR)    73              95

Descriptive statistics

                                   Training Data       Original Data
Standard deviation                 55.424956           69.63858
Coefficient of variation (CV)      0.08139764          0.1025217
Kurtosis                           0.09596164          -0.13134364
Mean                               680.91601           679.25695
Median Absolute Deviation (MAD)    36                  47
Skewness                           -0.16699288         -0.070714162
Sum                                4.0446002 × 10^8    13585139
Variance                           3071.9257           4849.5318
Monotonicity                       Not monotonic       Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
6786526
 
1.1%
6615801
 
1.0%
6745793
 
1.0%
7085661
 
1.0%
6815635
 
0.9%
6725622
 
0.9%
6695618
 
0.9%
6855557
 
0.9%
7135544
 
0.9%
6765508
 
0.9%
Other values (389)536729
90.4%
ValueCountFrequency (%)
850141
 
0.7%
678135
 
0.7%
669129
 
0.6%
683127
 
0.6%
661127
 
0.6%
708127
 
0.6%
685126
 
0.6%
672123
 
0.6%
688123
 
0.6%
703123
 
0.6%
Other values (389)18719
93.6%
ValueCountFrequency (%)
3952
< 0.1%
4311
 
< 0.1%
4352
< 0.1%
4373
< 0.1%
4391
 
< 0.1%
4401
 
< 0.1%
4411
 
< 0.1%
4454
< 0.1%
4462
< 0.1%
4472
< 0.1%
ValueCountFrequency (%)
3731
 
< 0.1%
3951
 
< 0.1%
4351
 
< 0.1%
4391
 
< 0.1%
4401
 
< 0.1%
4411
 
< 0.1%
4431
 
< 0.1%
4453
< 0.1%
4471
 
< 0.1%
4482
< 0.1%
ValueCountFrequency (%)
3731
 
< 0.1%
3951
 
< 0.1%
4351
 
< 0.1%
4391
 
< 0.1%
4401
 
< 0.1%
4411
 
< 0.1%
4431
 
< 0.1%
4453
< 0.1%
4471
 
< 0.1%
4482
< 0.1%
ValueCountFrequency (%)
3952
< 0.1%
4311
 
< 0.1%
4352
< 0.1%
4373
< 0.1%
4391
 
< 0.1%
4401
 
< 0.1%
4411
 
< 0.1%
4454
< 0.1%
4462
< 0.1%
4472
< 0.1%

loan_amount
Real number (ℝ)

                 Training Data   Original Data
Distinct         111570          18819
Distinct (%)     18.8%           94.1%
Missing          0               0
Missing (%)      0.0%            0.0%
Infinite         0               0
Infinite (%)     0.0%            0.0%
Mean             15020.298       15129.301
Minimum          500.09          500
Maximum          48959.95        49039.69
Zeros            0               0
Zeros (%)        0.0%            0.0%
Negative         0               0
Negative (%)     0.0%            0.0%
Memory size      4.5 MiB         156.4 KiB

Quantile statistics

                             Training Data   Original Data
Minimum                      500.09          500
5-th percentile              3139.37         500
Q1                           10279.62        8852.695
Median                       15000.22        14946.17
Q3                           18858.58        20998.868
95-th percentile             27139.83        29724.311
Maximum                      48959.95        49039.69
Range                        48459.86        48539.69
Interquartile range (IQR)    8578.96         12146.173

Descriptive statistics

                                   Training Data       Original Data
Standard deviation                 6926.5306           8605.4055
Coefficient of variation (CV)      0.46114469          0.56879069
Kurtosis                           -0.15014223         -0.32712232
Mean                               15020.298           15129.301
Median Absolute Deviation (MAD)    4386.47             6070.715
Skewness                           0.20735982          0.25309257
Sum                                8.9219667 × 10^9    3.0258602 × 10^8
Variance                           47976826            74053004
Monotonicity                       Not monotonic       Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
12892.25412
 
0.1%
15212.88338
 
0.1%
16004.97282
 
< 0.1%
1838.88278
 
< 0.1%
17051.01255
 
< 0.1%
15011.15250
 
< 0.1%
18078.57241
 
< 0.1%
12551.14241
 
< 0.1%
18054.98237
 
< 0.1%
8146.24232
 
< 0.1%
Other values (111560)591228
99.5%
ValueCountFrequency (%)
5001120
 
5.6%
11930.422
 
< 0.1%
15222.832
 
< 0.1%
8354.292
 
< 0.1%
10439.052
 
< 0.1%
14988.462
 
< 0.1%
15212.882
 
< 0.1%
5844.812
 
< 0.1%
16181.652
 
< 0.1%
19010.182
 
< 0.1%
Other values (18809)18862
94.3%
ValueCountFrequency (%)
500.091
 
< 0.1%
500.371
 
< 0.1%
500.911
 
< 0.1%
502.911
 
< 0.1%
507.411
 
< 0.1%
507.421
 
< 0.1%
507.463
< 0.1%
507.861
 
< 0.1%
508.341
 
< 0.1%
508.351
 
< 0.1%
ValueCountFrequency (%)
5001120
5.6%
502.911
 
< 0.1%
507.461
 
< 0.1%
512.531
 
< 0.1%
514.51
 
< 0.1%
515.521
 
< 0.1%
517.141
 
< 0.1%
518.181
 
< 0.1%
524.411
 
< 0.1%
525.031
 
< 0.1%
ValueCountFrequency (%)
5001120
0.2%
502.911
 
< 0.1%
507.461
 
< 0.1%
512.531
 
< 0.1%
514.51
 
< 0.1%
515.521
 
< 0.1%
517.141
 
< 0.1%
518.181
 
< 0.1%
524.411
 
< 0.1%
525.031
 
< 0.1%
ValueCountFrequency (%)
500.091
 
< 0.1%
500.371
 
< 0.1%
500.911
 
< 0.1%
502.911
 
< 0.1%
507.411
 
< 0.1%
507.421
 
< 0.1%
507.463
< 0.1%
507.861
 
< 0.1%
508.341
 
< 0.1%
508.351
 
< 0.1%

interest_rate
Real number (ℝ)

                 Training Data   Original Data
Distinct         1454            1365
Distinct (%)     0.2%            6.8%
Missing          0               0
Missing (%)      0.0%            0.0%
Infinite         0               0
Infinite (%)     0.0%            0.0%
Mean             12.356345       12.400627
Minimum          3.2             3.14
Maximum          20.99           22.51
Zeros            0               0
Zeros (%)        0.0%            0.0%
Negative         0               0
Negative (%)     0.0%            0.0%
Memory size      4.5 MiB         156.4 KiB

Quantile statistics

                             Training Data   Original Data
Minimum                      3.2             3.14
5-th percentile              9.1             8.43
Q1                           10.99           10.74
Median                       12.37           12.4
Q3                           13.68           14.0025
95-th percentile             15.72           16.48
Maximum                      20.99           22.51
Range                        17.79           19.37
Interquartile range (IQR)    2.69            3.2625

Descriptive statistics

                                   Training Data    Original Data
Standard deviation                 2.0089589        2.4427288
Coefficient of variation (CV)      0.1625852        0.19698431
Kurtosis                           0.059797501      -0.01614091
Mean                               12.356345        12.400627
Median Absolute Deviation (MAD)    1.34             1.63
Skewness                           0.049945315      0.027425966
Sum                                7339594.9        248012.53
Variance                           4.0359159        5.9669241
Monotonicity                       Not monotonic    Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
12.312638
 
0.4%
12.522436
 
0.4%
13.352415
 
0.4%
12.822406
 
0.4%
12.232362
 
0.4%
11.262318
 
0.4%
11.62236
 
0.4%
13.782222
 
0.4%
12.092215
 
0.4%
12.812209
 
0.4%
Other values (1444)570537
96.1%
ValueCountFrequency (%)
12.3146
 
0.2%
13.7845
 
0.2%
11.2645
 
0.2%
12.8245
 
0.2%
12.5244
 
0.2%
12.2344
 
0.2%
12.0944
 
0.2%
13.3543
 
0.2%
11.643
 
0.2%
12.542
 
0.2%
Other values (1355)19559
97.8%
ValueCountFrequency (%)
3.21
 
< 0.1%
3.321
 
< 0.1%
3.661
 
< 0.1%
3.791
 
< 0.1%
3.813
< 0.1%
3.831
 
< 0.1%
3.892
< 0.1%
3.921
 
< 0.1%
3.982
< 0.1%
4.011
 
< 0.1%
ValueCountFrequency (%)
3.141
< 0.1%
3.21
< 0.1%
3.631
< 0.1%
3.791
< 0.1%
3.811
< 0.1%
3.921
< 0.1%
3.981
< 0.1%
4.111
< 0.1%
4.181
< 0.1%
4.291
< 0.1%
ValueCountFrequency (%)
3.141
< 0.1%
3.21
< 0.1%
3.631
< 0.1%
3.791
< 0.1%
3.811
< 0.1%
3.921
< 0.1%
3.981
< 0.1%
4.111
< 0.1%
4.181
< 0.1%
4.291
< 0.1%
ValueCountFrequency (%)
3.21
 
< 0.1%
3.321
 
< 0.1%
3.661
 
< 0.1%
3.791
 
< 0.1%
3.813
< 0.1%
3.831
 
< 0.1%
3.892
< 0.1%
3.921
 
< 0.1%
3.982
< 0.1%
4.011
 
< 0.1%

gender
Categorical

                Training Data   Original Data
Distinct        3               3
Distinct (%)    < 0.1%          < 0.1%
Missing         0               0
Missing (%)     0.0%            0.0%
Memory size     4.5 MiB         156.4 KiB

Value           Training Data   Original Data
Female          306175          10034
Male            284091          9536
Other           3728            430

Length

                 Training Data   Original Data
Max length       6               6
Median length    6               6
Mean length      5.0371788       5.0249
Min length       4               4

Characters and Unicode

                       Training Data   Original Data
Total characters       2992054         100498
Distinct characters    10              10
Distinct categories    1               1
Distinct scripts       1               1
Distinct blocks        1               1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
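
The length and character totals above follow directly from the category counts. As a sketch, the snippet below recomputes the total character count and the mean length for gender in the Training Data from the three category frequencies reported for this variable.

```python
# Category counts for gender in the Training Data, as reported.
training_gender = {"Female": 306_175, "Male": 284_091, "Other": 3_728}

total_rows = sum(training_gender.values())
# Each row contributes as many characters as its label is long.
total_chars = sum(len(label) * count for label, count in training_gender.items())
mean_length = total_chars / total_rows

print(total_rows, total_chars, round(mean_length, 7))
```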

Unique

              Training Data   Original Data
Unique        0               0
Unique (%)    0.0%            0.0%

Sample

           Training Data   Original Data
1st row    Female          Male
2nd row    Male            Female
3rd row    Male            Female
4th row    Female          Female
5th row    Male            Other

Common Values

Training Data
Value     Count     Frequency (%)
Female    306175    51.5%
Male      284091    47.8%
Other     3728      0.6%

Original Data
Value     Count     Frequency (%)
Female    10034     50.2%
Male      9536      47.7%
Other     430       2.1%
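
The frequency column is each count divided by the number of observations. The sketch below rebuilds the Training Data rows for gender. The function name frequency_table is hypothetical, and the "< 0.1%" label for shares that would otherwise round to 0.0% is inferred from how this report displays small counts, not taken from ydata-profiling documentation.

```python
def frequency_table(counts):
    """Return (value, count, percentage label) rows, most frequent first."""
    total = sum(counts.values())
    rows = []
    for value, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        label = f"{100 * count / total:.1f}%"
        # Assumed display convention: non-zero shares that round to 0.0%
        # are shown as "< 0.1%".
        if label == "0.0%" and count > 0:
            label = "< 0.1%"
        rows.append((value, count, label))
    return rows


training_gender = {"Female": 306_175, "Male": 284_091, "Other": 3_728}
for row in frequency_table(training_gender):
    print(row)
```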

Length

Histogram of lengths of the category

Common Values (Plot)

Training Data


Original Data

ValueCountFrequency (%)
female306175
51.5%
male284091
47.8%
other3728
 
0.6%
ValueCountFrequency (%)
female10034
50.2%
male9536
47.7%
other430
 
2.1%

Most occurring characters

ValueCountFrequency (%)
e900169
30.1%
a590266
19.7%
l590266
19.7%
F306175
 
10.2%
m306175
 
10.2%
M284091
 
9.5%
O3728
 
0.1%
t3728
 
0.1%
h3728
 
0.1%
r3728
 
0.1%
ValueCountFrequency (%)
e30034
29.9%
a19570
19.5%
l19570
19.5%
F10034
 
10.0%
m10034
 
10.0%
M9536
 
9.5%
O430
 
0.4%
t430
 
0.4%
h430
 
0.4%
r430
 
0.4%
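
The character counts in this table can be derived from the category counts alone, since every row of a category contributes the same letters. The sketch below does this for gender in the Training Data with collections.Counter from the standard library.

```python
from collections import Counter

# Category counts for gender in the Training Data, as reported.
training_gender = {"Female": 306_175, "Male": 284_091, "Other": 3_728}

char_counts = Counter()
for label, count in training_gender.items():
    for ch in label:
        # Upper and lower case are counted separately, as in the report.
        char_counts[ch] += count

total = sum(char_counts.values())
for ch, count in char_counts.most_common():
    print(ch, count, f"{100 * count / total:.1f}%")
```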

Most occurring categories

ValueCountFrequency (%)
(unknown)2992054
100.0%
ValueCountFrequency (%)
(unknown)100498
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
e900169
30.1%
a590266
19.7%
l590266
19.7%
F306175
 
10.2%
m306175
 
10.2%
M284091
 
9.5%
O3728
 
0.1%
t3728
 
0.1%
h3728
 
0.1%
r3728
 
0.1%
ValueCountFrequency (%)
e30034
29.9%
a19570
19.5%
l19570
19.5%
F10034
 
10.0%
m10034
 
10.0%
M9536
 
9.5%
O430
 
0.4%
t430
 
0.4%
h430
 
0.4%
r430
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
(unknown)2992054
100.0%
ValueCountFrequency (%)
(unknown)100498
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
e900169
30.1%
a590266
19.7%
l590266
19.7%
F306175
 
10.2%
m306175
 
10.2%
M284091
 
9.5%
O3728
 
0.1%
t3728
 
0.1%
h3728
 
0.1%
r3728
 
0.1%
ValueCountFrequency (%)
e30034
29.9%
a19570
19.5%
l19570
19.5%
F10034
 
10.0%
m10034
 
10.0%
M9536
 
9.5%
O430
 
0.4%
t430
 
0.4%
h430
 
0.4%
r430
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
(unknown)2992054
100.0%
ValueCountFrequency (%)
(unknown)100498
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
e900169
30.1%
a590266
19.7%
l590266
19.7%
F306175
 
10.2%
m306175
 
10.2%
M284091
 
9.5%
O3728
 
0.1%
t3728
 
0.1%
h3728
 
0.1%
r3728
 
0.1%
ValueCountFrequency (%)
e30034
29.9%
a19570
19.5%
l19570
19.5%
F10034
 
10.0%
m10034
 
10.0%
M9536
 
9.5%
O430
 
0.4%
t430
 
0.4%
h430
 
0.4%
r430
 
0.4%

marital_status
Categorical

                Training Data   Original Data
Distinct        4               4
Distinct (%)    < 0.1%          < 0.1%
Missing         0               0
Missing (%)     0.0%            0.0%
Memory size     4.5 MiB         156.4 KiB

Value           Training Data   Original Data
Single          288843          9031
Married         277239          8974
Divorced        21312           1428
Widowed         6600            567

Length

                 Training Data   Original Data
Max length       8               8
Median length    7               7
Mean length      6.5496066       6.61985
Min length       6               6

Characters and Unicode

                       Training Data   Original Data
Total characters       3890427         132397
Distinct characters    16              16
Distinct categories    1               1
Distinct scripts       1               1
Distinct blocks        1               1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

              Training Data   Original Data
Unique        0               0
Unique (%)    0.0%            0.0%

Sample

           Training Data   Original Data
1st row    Single          Married
2nd row    Married         Married
3rd row    Single          Single
4th row    Single          Single
5th row    Married         Single

Common Values

Training Data
Value       Count     Frequency (%)
Single      288843    48.6%
Married     277239    46.7%
Divorced    21312     3.6%
Widowed     6600      1.1%

Original Data
Value       Count     Frequency (%)
Single      9031      45.2%
Married     8974      44.9%
Divorced    1428      7.1%
Widowed     567       2.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Training Data


Original Data

ValueCountFrequency (%)
single288843
48.6%
married277239
46.7%
divorced21312
 
3.6%
widowed6600
 
1.1%
ValueCountFrequency (%)
single9031
45.2%
married8974
44.9%
divorced1428
 
7.1%
widowed567
 
2.8%

Most occurring characters

ValueCountFrequency (%)
i593994
15.3%
e593994
15.3%
r575790
14.8%
d311751
8.0%
g288843
7.4%
l288843
7.4%
n288843
7.4%
S288843
7.4%
a277239
7.1%
M277239
7.1%
Other values (6)105048
 
2.7%
ValueCountFrequency (%)
i20000
15.1%
e20000
15.1%
r19376
14.6%
d11536
8.7%
g9031
6.8%
l9031
6.8%
n9031
6.8%
S9031
6.8%
a8974
6.8%
M8974
6.8%
Other values (6)7413
 
5.6%

Most occurring categories

ValueCountFrequency (%)
(unknown)3890427
100.0%
ValueCountFrequency (%)
(unknown)132397
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
i593994
15.3%
e593994
15.3%
r575790
14.8%
d311751
8.0%
g288843
7.4%
l288843
7.4%
n288843
7.4%
S288843
7.4%
a277239
7.1%
M277239
7.1%
Other values (6)105048
 
2.7%
ValueCountFrequency (%)
i20000
15.1%
e20000
15.1%
r19376
14.6%
d11536
8.7%
g9031
6.8%
l9031
6.8%
n9031
6.8%
S9031
6.8%
a8974
6.8%
M8974
6.8%
Other values (6)7413
 
5.6%

Most occurring scripts

ValueCountFrequency (%)
(unknown)3890427
100.0%
ValueCountFrequency (%)
(unknown)132397
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
i593994
15.3%
e593994
15.3%
r575790
14.8%
d311751
8.0%
g288843
7.4%
l288843
7.4%
n288843
7.4%
S288843
7.4%
a277239
7.1%
M277239
7.1%
Other values (6)105048
 
2.7%
ValueCountFrequency (%)
i20000
15.1%
e20000
15.1%
r19376
14.6%
d11536
8.7%
g9031
6.8%
l9031
6.8%
n9031
6.8%
S9031
6.8%
a8974
6.8%
M8974
6.8%
Other values (6)7413
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
(unknown)3890427
100.0%
ValueCountFrequency (%)
(unknown)132397
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
i593994
15.3%
e593994
15.3%
r575790
14.8%
d311751
8.0%
g288843
7.4%
l288843
7.4%
n288843
7.4%
S288843
7.4%
a277239
7.1%
M277239
7.1%
Other values (6)105048
 
2.7%
ValueCountFrequency (%)
i20000
15.1%
e20000
15.1%
r19376
14.6%
d11536
8.7%
g9031
6.8%
l9031
6.8%
n9031
6.8%
S9031
6.8%
a8974
6.8%
M8974
6.8%
Other values (6)7413
 
5.6%

education_level
Categorical

                Training Data   Original Data
Distinct        5               5
Distinct (%)    < 0.1%          < 0.1%
Missing         0               0
Missing (%)     0.0%            0.0%
Memory size     4.5 MiB         156.4 KiB

Value           Training Data   Original Data
Bachelor's      279606          8045
High School     183592          5919
Master's        93097           3724
Other           26677           1508
PhD             11022           804

Length

                 Training Data   Original Data
Max length       11              11
Median length    10              10
Mean length      9.6411731       9.26515
Min length       3               3

Characters and Unicode

                       Training Data   Original Data
Total characters       5726799         185303
Distinct characters    20              20
Distinct categories    1               1
Distinct scripts       1               1
Distinct blocks        1               1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

              Training Data   Original Data
Unique        0               0
Unique (%)    0.0%            0.0%

Sample

           Training Data   Original Data
1st row    High School     Master's
2nd row    Master's        Bachelor's
3rd row    High School     High School
4th row    High School     High School
5th row    High School     Other

Common Values

Training Data
Value          Count     Frequency (%)
Bachelor's     279606    47.1%
High School    183592    30.9%
Master's       93097     15.7%
Other          26677     4.5%
PhD            11022     1.9%

Original Data
Value          Count     Frequency (%)
Bachelor's     8045      40.2%
High School    5919      29.6%
Master's       3724      18.6%
Other          1508      7.5%
PhD            804       4.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Training Data


Original Data

ValueCountFrequency (%)
bachelor's279606
36.0%
high183592
23.6%
school183592
23.6%
master's93097
 
12.0%
other26677
 
3.4%
phd11022
 
1.4%
ValueCountFrequency (%)
bachelor's8045
31.0%
high5919
22.8%
school5919
22.8%
master's3724
14.4%
other1508
 
5.8%
phd804
 
3.1%
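
In the word-level table above, "High School" appears as two separate entries, "high" and "school", and the percentages are shares of all words rather than of all rows. The sketch below reproduces that behaviour for the Training Data by lower-casing each label and splitting it on whitespace. This tokenization rule is an assumption that happens to match the reported figures, and it may not be exactly what ydata-profiling does internally.

```python
from collections import Counter

# Category counts for education_level in the Training Data, as reported.
training_education = {
    "Bachelor's": 279_606,
    "High School": 183_592,
    "Master's": 93_097,
    "Other": 26_677,
    "PhD": 11_022,
}

word_counts = Counter()
for label, count in training_education.items():
    # Assumed tokenization: lower-case, then split on whitespace.
    for word in label.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
shares = {word: round(100 * count / total_words, 1) for word, count in word_counts.items()}
print(total_words, shares)
```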

Most occurring characters

ValueCountFrequency (%)
h684489
12.0%
o646790
11.3%
s465800
 
8.1%
c463198
 
8.1%
l463198
 
8.1%
e399380
 
7.0%
r399380
 
7.0%
a372703
 
6.5%
'372703
 
6.5%
B279606
 
4.9%
Other values (10)1179552
20.6%
ValueCountFrequency (%)
h22195
12.0%
o19883
10.7%
s15493
 
8.4%
c13964
 
7.5%
l13964
 
7.5%
e13277
 
7.2%
r13277
 
7.2%
a11769
 
6.4%
'11769
 
6.4%
B8045
 
4.3%
Other values (10)41667
22.5%

Most occurring categories

ValueCountFrequency (%)
(unknown)5726799
100.0%
ValueCountFrequency (%)
(unknown)185303
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
h684489
12.0%
o646790
11.3%
s465800
 
8.1%
c463198
 
8.1%
l463198
 
8.1%
e399380
 
7.0%
r399380
 
7.0%
a372703
 
6.5%
'372703
 
6.5%
B279606
 
4.9%
Other values (10)1179552
20.6%
ValueCountFrequency (%)
h22195
12.0%
o19883
10.7%
s15493
 
8.4%
c13964
 
7.5%
l13964
 
7.5%
e13277
 
7.2%
r13277
 
7.2%
a11769
 
6.4%
'11769
 
6.4%
B8045
 
4.3%
Other values (10)41667
22.5%

Most occurring scripts

ValueCountFrequency (%)
(unknown)5726799
100.0%
ValueCountFrequency (%)
(unknown)185303
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
h684489
12.0%
o646790
11.3%
s465800
 
8.1%
c463198
 
8.1%
l463198
 
8.1%
e399380
 
7.0%
r399380
 
7.0%
a372703
 
6.5%
'372703
 
6.5%
B279606
 
4.9%
Other values (10)1179552
20.6%
ValueCountFrequency (%)
h22195
12.0%
o19883
10.7%
s15493
 
8.4%
c13964
 
7.5%
l13964
 
7.5%
e13277
 
7.2%
r13277
 
7.2%
a11769
 
6.4%
'11769
 
6.4%
B8045
 
4.3%
Other values (10)41667
22.5%

Most occurring blocks

ValueCountFrequency (%)
(unknown)5726799
100.0%
ValueCountFrequency (%)
(unknown)185303
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
h684489
12.0%
o646790
11.3%
s465800
 
8.1%
c463198
 
8.1%
l463198
 
8.1%
e399380
 
7.0%
r399380
 
7.0%
a372703
 
6.5%
'372703
 
6.5%
B279606
 
4.9%
Other values (10)1179552
20.6%
ValueCountFrequency (%)
h22195
12.0%
o19883
10.7%
s15493
 
8.4%
c13964
 
7.5%
l13964
 
7.5%
e13277
 
7.2%
r13277
 
7.2%
a11769
 
6.4%
'11769
 
6.4%
B8045
 
4.3%
Other values (10)41667
22.5%

employment_status
Categorical

                 Training Data   Original Data
Distinct         5               5
Distinct (%)     < 0.1%          < 0.1%
Missing          0               0
Missing (%)      0.0%            0.0%
Memory size      4.5 MiB         156.4 KiB

Value            Training Data   Original Data
Employed         450645          13007
Unemployed       62485           2113
Self-employed    52480           2923
Retired          16453           1176
Student          11931           781

Length

                 Training Data   Original Data
Max length       13              13
Median length    8               8
Mean length      8.6043596       8.8442
Min length       7               7

Characters and Unicode

                       Training Data   Original Data
Total characters       5110938         176884
Distinct characters    18              18
Distinct categories    1               1
Distinct scripts       1               1
Distinct blocks        1               1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

              Training Data   Original Data
Unique        0               0
Unique (%)    0.0%            0.0%

Sample

           Training Data    Original Data
1st row    Self-employed    Employed
2nd row    Employed         Employed
3rd row    Employed         Employed
4th row    Employed         Employed
5th row    Employed         Employed

Common Values

Training Data
Value            Count     Frequency (%)
Employed         450645    75.9%
Unemployed       62485     10.5%
Self-employed    52480     8.8%
Retired          16453     2.8%
Student          11931     2.0%

Original Data
Value            Count     Frequency (%)
Employed         13007     65.0%
Self-employed    2923      14.6%
Unemployed       2113      10.6%
Retired          1176      5.9%
Student          781       3.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Training Data


Original Data

ValueCountFrequency (%)
employed450645
75.9%
unemployed62485
 
10.5%
self-employed52480
 
8.8%
retired16453
 
2.8%
student11931
 
2.0%
ValueCountFrequency (%)
employed13007
65.0%
self-employed2923
 
14.6%
unemployed2113
 
10.6%
retired1176
 
5.9%
student781
 
3.9%

Most occurring characters

ValueCountFrequency (%)
e777892
15.2%
l618090
12.1%
d593994
11.6%
m565610
11.1%
y565610
11.1%
p565610
11.1%
o565610
11.1%
E450645
8.8%
n74416
 
1.5%
S64411
 
1.3%
Other values (8)269050
 
5.3%
ValueCountFrequency (%)
e29135
16.5%
l20966
11.9%
d20000
11.3%
m18043
10.2%
y18043
10.2%
p18043
10.2%
o18043
10.2%
E13007
7.4%
S3704
 
2.1%
f2923
 
1.7%
Other values (8)14977
8.5%

Most occurring categories

ValueCountFrequency (%)
(unknown)5110938
100.0%
ValueCountFrequency (%)
(unknown)176884
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
e777892
15.2%
l618090
12.1%
d593994
11.6%
m565610
11.1%
y565610
11.1%
p565610
11.1%
o565610
11.1%
E450645
8.8%
n74416
 
1.5%
S64411
 
1.3%
Other values (8)269050
 
5.3%
ValueCountFrequency (%)
e29135
16.5%
l20966
11.9%
d20000
11.3%
m18043
10.2%
y18043
10.2%
p18043
10.2%
o18043
10.2%
E13007
7.4%
S3704
 
2.1%
f2923
 
1.7%
Other values (8)14977
8.5%

Most occurring scripts

ValueCountFrequency (%)
(unknown)5110938
100.0%
ValueCountFrequency (%)
(unknown)176884
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
e777892
15.2%
l618090
12.1%
d593994
11.6%
m565610
11.1%
y565610
11.1%
p565610
11.1%
o565610
11.1%
E450645
8.8%
n74416
 
1.5%
S64411
 
1.3%
Other values (8)269050
 
5.3%
ValueCountFrequency (%)
e29135
16.5%
l20966
11.9%
d20000
11.3%
m18043
10.2%
y18043
10.2%
p18043
10.2%
o18043
10.2%
E13007
7.4%
S3704
 
2.1%
f2923
 
1.7%
Other values (8)14977
8.5%

Most occurring blocks

ValueCountFrequency (%)
(unknown)5110938
100.0%
ValueCountFrequency (%)
(unknown)176884
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
e777892
15.2%
l618090
12.1%
d593994
11.6%
m565610
11.1%
y565610
11.1%
p565610
11.1%
o565610
11.1%
E450645
8.8%
n74416
 
1.5%
S64411
 
1.3%
Other values (8)269050
 
5.3%
ValueCountFrequency (%)
e29135
16.5%
l20966
11.9%
d20000
11.3%
m18043
10.2%
y18043
10.2%
p18043
10.2%
o18043
10.2%
E13007
7.4%
S3704
 
2.1%
f2923
 
1.7%
Other values (8)14977
8.5%

loan_purpose
Categorical

                      Training Data   Original Data
Distinct              8               8
Distinct (%)          < 0.1%          < 0.1%
Missing               0               0
Missing (%)           0.0%            0.0%
Memory size           4.5 MiB         156.4 KiB

Value                 Training Data   Original Data
Debt consolidation    324695          7981
Other                 63874           2550
Car                   58108           2390
Home                  44118           1972
Education             36641           1675
Other values (3)      66558           3432

Length

                 Training Data   Original Data
Max length       18              18
Median length    18              9
Mean length      12.38077        10.64005
Min length       3               3

Characters and Unicode

                       Training Data   Original Data
Total characters       7354103         212801
Distinct characters    24              24
Distinct categories    1               1
Distinct scripts       1               1
Distinct blocks        1               1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

              Training Data   Original Data
Unique        0               0
Unique (%)    0.0%            0.0%

Sample

           Training Data         Original Data
1st row    Other                 Car
2nd row    Debt consolidation    Debt consolidation
3rd row    Debt consolidation    Business
4th row    Debt consolidation    Other
5th row    Other                 Car

Common Values

Training Data
Value                 Count     Frequency (%)
Debt consolidation    324695    54.7%
Other                 63874     10.8%
Car                   58108     9.8%
Home                  44118     7.4%
Education             36641     6.2%
Business              35303     5.9%
Medical               22806     3.8%
Vacation              8449      1.4%

Original Data
Value                 Count     Frequency (%)
Debt consolidation    7981      39.9%
Other                 2550      12.8%
Car                   2390      11.9%
Home                  1972      9.9%
Education             1675      8.4%
Business              1629      8.1%
Medical               1196      6.0%
Vacation              607       3.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Training Data


Original Data

ValueCountFrequency (%)
debt324695
35.3%
consolidation324695
35.3%
other63874
 
7.0%
car58108
 
6.3%
home44118
 
4.8%
education36641
 
4.0%
business35303
 
3.8%
medical22806
 
2.5%
vacation8449
 
0.9%
ValueCountFrequency (%)
debt7981
28.5%
consolidation7981
28.5%
other2550
 
9.1%
car2390
 
8.5%
home1972
 
7.0%
education1675
 
6.0%
business1629
 
5.8%
medical1196
 
4.3%
vacation607
 
2.2%

Most occurring characters

ValueCountFrequency (%)
o1063293
14.5%
t758354
10.3%
i752589
10.2%
n729783
9.9%
e490796
 
6.7%
a459148
 
6.2%
s430604
 
5.9%
c392591
 
5.3%
d384142
 
5.2%
l347501
 
4.7%
Other values (14)1545302
21.0%
ValueCountFrequency (%)
o28197
13.3%
i21069
9.9%
t20794
9.8%
n19873
9.3%
e15328
 
7.2%
a14456
 
6.8%
s12868
 
6.0%
c11459
 
5.4%
d10852
 
5.1%
l9177
 
4.3%
Other values (14)48728
22.9%

Most occurring categories

ValueCountFrequency (%)
(unknown)7354103
100.0%
ValueCountFrequency (%)
(unknown)212801
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
o1063293
14.5%
t758354
10.3%
i752589
10.2%
n729783
9.9%
e490796
 
6.7%
a459148
 
6.2%
s430604
 
5.9%
c392591
 
5.3%
d384142
 
5.2%
l347501
 
4.7%
Other values (14)1545302
21.0%
ValueCountFrequency (%)
o28197
13.3%
i21069
9.9%
t20794
9.8%
n19873
9.3%
e15328
 
7.2%
a14456
 
6.8%
s12868
 
6.0%
c11459
 
5.4%
d10852
 
5.1%
l9177
 
4.3%
Other values (14)48728
22.9%

Most occurring scripts

ValueCountFrequency (%)
(unknown)7354103
100.0%
ValueCountFrequency (%)
(unknown)212801
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
o1063293
14.5%
t758354
10.3%
i752589
10.2%
n729783
9.9%
e490796
 
6.7%
a459148
 
6.2%
s430604
 
5.9%
c392591
 
5.3%
d384142
 
5.2%
l347501
 
4.7%
Other values (14)1545302
21.0%
ValueCountFrequency (%)
o28197
13.3%
i21069
9.9%
t20794
9.8%
n19873
9.3%
e15328
 
7.2%
a14456
 
6.8%
s12868
 
6.0%
c11459
 
5.4%
d10852
 
5.1%
l9177
 
4.3%
Other values (14)48728
22.9%

Most occurring blocks

ValueCountFrequency (%)
(unknown)7354103
100.0%
ValueCountFrequency (%)
(unknown)212801
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
o1063293
14.5%
t758354
10.3%
i752589
10.2%
n729783
9.9%
e490796
 
6.7%
a459148
 
6.2%
s430604
 
5.9%
c392591
 
5.3%
d384142
 
5.2%
l347501
 
4.7%
Other values (14)1545302
21.0%
ValueCountFrequency (%)
o28197
13.3%
i21069
9.9%
t20794
9.8%
n19873
9.3%
e15328
 
7.2%
a14456
 
6.8%
s12868
 
6.0%
c11459
 
5.4%
d10852
 
5.1%
l9177
 
4.3%
Other values (14)48728
22.9%

grade_subgrade
Categorical

                     Training Data   Original Data
Distinct             30              30
Distinct (%)         < 0.1%          0.1%
Missing              0               0
Missing (%)          0.0%            0.0%
Memory size          4.5 MiB         156.4 KiB

Value                Training Data   Original Data
C3                   58695           1514
C4                   55957           1463
C2                   54443           1436
C1                   53363           1410
C5                   53317           1422
Other values (25)    318219          12755

Length

Statistic      Training Data  Original Data
Max length     2              2
Median length  2              2
Mean length    2              2
Min length     2              2

Characters and Unicode

Statistic            Training Data  Original Data
Total characters     1187988        40000
Distinct characters  11             11
Distinct categories  1              1
Distinct scripts     1              1
Distinct blocks      1              1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
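
As a rough illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category and name of each code point. The sketch below looks up the characters of grade_subgrade values. The character set `chars` is an assumption (grade letters A to F and subgrade digits 1 to 5, which is consistent with the 11 distinct characters reported), and the lookup is independent of how the report itself groups characters, since the report lists a single "(unknown)" category.

```python
import unicodedata

# Assumed character set: grade letters A-F and subgrade digits 1-5.
chars = "ABCDEF12345"

def describe(ch):
    """Return (general category, Unicode name) for one character."""
    return unicodedata.category(ch), unicodedata.name(ch)

properties = {ch: describe(ch) for ch in chars}
for ch, (category, name) in properties.items():
    print(ch, category, name)
```

Uppercase letters come back with category `Lu` and decimal digits with `Nd`, so a value such as C3 mixes two general categories.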

Unique

Statistic   Training Data  Original Data
Unique      0              0
Unique (%)  0.0%           0.0%

Sample

Row      Training Data  Original Data
1st row  C3             B5
2nd row  D3             F1
3rd row  C5             B4
4th row  F1             A5
5th row  D1             D5

Common Values

Training Data
Value              Count   Frequency (%)
C3                 58695   9.9%
C4                 55957   9.4%
C2                 54443   9.2%
C1                 53363   9.0%
C5                 53317   9.0%
D1                 37029   6.2%
D3                 36694   6.2%
D4                 35097   5.9%
D2                 34432   5.8%
D5                 32101   5.4%
Other values (20)  142866  24.1%

Original Data
Value              Count   Frequency (%)
C3                 1514    7.6%
C4                 1463    7.3%
C2                 1436    7.2%
C5                 1422    7.1%
C1                 1410    7.0%
D1                 1155    5.8%
D3                 1146    5.7%
D4                 1100    5.5%
D2                 1091    5.5%
D5                 1073    5.4%
Other values (20)  7190    35.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Training Data


Number of variable categories passes threshold (config.plot.cat_freq.max_unique)

Original Data


Number of variable categories passes threshold (config.plot.cat_freq.max_unique)

Most occurring characters

Training Data
Value  Count   Frequency (%)
C      275775  23.2%
D      175353  14.8%
3      123538  10.4%
4      120203  10.1%
1      118761  10.0%
2      117635  9.9%
5      113857  9.6%
B      71251   6.0%
E      34458   2.9%
F      27301   2.3%

Original Data
Value  Count   Frequency (%)
C      7245    18.1%
D      5565    13.9%
3      4078    10.2%
4      4016    10.0%
2      3983    10.0%
1      3981    10.0%
5      3942    9.9%
B      3075    7.7%
E      1695    4.2%
F      1551    3.9%

Most occurring categories

Training Data
Value      Count    Frequency (%)
(unknown)  1187988  100.0%

Original Data
Value      Count    Frequency (%)
(unknown)  40000    100.0%

Most occurring scripts

Training Data
Value      Count    Frequency (%)
(unknown)  1187988  100.0%

Original Data
Value      Count    Frequency (%)
(unknown)  40000    100.0%

Most occurring blocks

Training Data
Value      Count    Frequency (%)
(unknown)  1187988  100.0%

Original Data
Value      Count    Frequency (%)
(unknown)  40000    100.0%

loan_paid_back
Categorical

Statistic     Training Data  Original Data
Distinct      2              2
Distinct (%)  < 0.1%         < 0.1%
Missing       0              0
Missing (%)   0.0%           0.0%
Memory size   4.5 MiB        156.4 KiB

Training Data
1.0  474494
0.0  119500

Original Data
1    15998
0    4002

Length

Statistic      Training Data  Original Data
Max length     3              1
Median length  3              1
Mean length    3              1
Min length     3              1

Characters and Unicode

Statistic            Training Data  Original Data
Total characters     1781982        20000
Distinct characters  3              2
Distinct categories  1              1
Distinct scripts     1              1
Distinct blocks      1              1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic   Training Data  Original Data
Unique      0              0
Unique (%)  0.0%           0.0%

Sample

Row      Training Data  Original Data
1st row  1.0            1
2nd row  0.0            1
3rd row  1.0            1
4th row  1.0            1
5th row  1.0            1

Common Values

Training Data
Value  Count   Frequency (%)
1.0    474494  79.9%
0.0    119500  20.1%

Original Data
Value  Count   Frequency (%)
1      15998   80.0%
0      4002    20.0%
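
The two datasets encode the same binary target with different labels (1.0 and 0.0 in Training Data, 1 and 0 in Original Data). Below is a minimal sketch, assuming the labels are handled as strings, of normalizing both encodings to integers and recomputing the frequencies from the counts above. The helper names `normalize` and `frequencies` are illustrative and not part of the report.

```python
# Counts copied from the Common Values tables above.
training_counts = {"1.0": 474494, "0.0": 119500}
original_counts = {"1": 15998, "0": 4002}

def normalize(counts):
    """Map labels such as '1.0' or '1' to the integers 1 and 0."""
    return {int(float(label)): n for label, n in counts.items()}

def frequencies(counts):
    """Return each label's share of the total as a percentage, rounded to one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}

training_freq = frequencies(normalize(training_counts))
original_freq = frequencies(normalize(original_counts))
print(training_freq)
print(original_freq)
```

After normalization both datasets use the same keys, so the class balance (roughly 80% paid back in each) can be compared directly.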

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Training Data

[Figure: common values plot]

Original Data

[Figure: common values plot]

Most occurring characters

Training Data
Value  Count   Frequency (%)
0      713494  40.0%
.      593994  33.3%
1      474494  26.6%

Original Data
Value  Count   Frequency (%)
1      15998   80.0%
0      4002    20.0%
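
The Training Data character counts follow directly from the value counts, because every value there is three characters long ("1.0" or "0.0"). A short sketch of that arithmetic, using only the counts reported for loan_paid_back:

```python
from collections import Counter

# Value counts for loan_paid_back in the Training Data, as reported above.
value_counts = {"1.0": 474494, "0.0": 119500}

# Each occurrence of a value contributes all of its characters.
char_counts = Counter()
for value, n in value_counts.items():
    for ch in value:
        char_counts[ch] += n

total = sum(char_counts.values())
for ch, n in char_counts.most_common():
    print(ch, n, f"{100 * n / total:.1f}%")
```

The character "0" dominates because it appears once in every "1.0" and twice in every "0.0", which is why it reaches 40.0% even though the value 0.0 accounts for only 20.1% of rows.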

Most occurring categories

Training Data
Value      Count    Frequency (%)
(unknown)  1781982  100.0%

Original Data
Value      Count    Frequency (%)
(unknown)  20000    100.0%

Most occurring scripts

Training Data
Value      Count    Frequency (%)
(unknown)  1781982  100.0%

Original Data
Value      Count    Frequency (%)
(unknown)  20000    100.0%

Most occurring blocks

Training Data
Value      Count    Frequency (%)
(unknown)  1781982  100.0%

Original Data
Value      Count    Frequency (%)
(unknown)  20000    100.0%

Interactions

[Figures: 25 interaction plots each for Training Data and Original Data]

Correlations

Training Data

[Figure: correlation plot]

Original Data

[Figure: correlation plot]

Training Data

Variable              annual_income  credit_score  debt_to_income_ratio  education_level  employment_status  gender  grade_subgrade  interest_rate  loan_amount  loan_paid_back  loan_purpose  marital_status
annual_income         1.000          0.004         0.005                 0.008            0.009              0.004   0.007           -0.003         -0.009       0.020           0.007         0.010
credit_score          0.004          1.000         -0.060                0.007            0.051              0.008   0.638           -0.517         -0.008       0.232           0.008         0.011
debt_to_income_ratio  0.005          -0.060        1.000                 0.006            0.088              0.004   0.024           0.026          -0.012       0.334           0.006         0.004
education_level       0.008          0.007         0.006                 1.000            0.012              0.004   0.013           0.008          0.005        0.025           0.011         0.008
employment_status     0.009          0.051         0.088                 0.012            1.000              0.003   0.052           0.025          0.010        0.657           0.015         0.006
gender                0.004          0.008         0.004                 0.004            0.003              1.000   0.009           0.004          0.010        0.007           0.007         0.002
grade_subgrade        0.007          0.638         0.024                 0.013            0.052              0.009   1.000           0.192          0.013        0.228           0.008         0.013
interest_rate         -0.003         -0.517        0.026                 0.008            0.025              0.004   0.192           1.000          -0.001       0.129           0.006         0.006
loan_amount           -0.009         -0.008        -0.012                0.005            0.010              0.010   0.013           -0.001         1.000        0.013           0.008         0.008
loan_paid_back        0.020          0.232         0.334                 0.025            0.657              0.007   0.228           0.129          0.013        1.000           0.025         0.001
loan_purpose          0.007          0.008         0.006                 0.011            0.015              0.007   0.008           0.006          0.008        0.025           1.000         0.010
marital_status        0.010          0.011         0.004                 0.008            0.006              0.002   0.013           0.006          0.008        0.001           0.010         1.000

Original Data

Variable              annual_income  credit_score  debt_to_income_ratio  education_level  employment_status  gender  grade_subgrade  interest_rate  loan_amount  loan_paid_back  loan_purpose  marital_status
annual_income         1.000          0.005         -0.003                0.000            0.004              0.000   0.004           -0.006         0.002        0.029           0.000         0.006
credit_score          0.005          1.000         -0.025                0.000            0.000              0.000   0.634           -0.551         0.008        0.198           0.000         0.004
debt_to_income_ratio  -0.003         -0.025        1.000                 0.004            0.012              0.023   0.011           0.007          -0.008       0.220           0.005         0.000
education_level       0.000          0.000         0.004                 1.000            0.005              0.000   0.000           0.000          0.000        0.018           0.010         0.004
employment_status     0.004          0.000         0.012                 0.005            1.000              0.000   0.006           0.000          0.008        0.584           0.000         0.000
gender                0.000          0.000         0.023                 0.000            0.000              1.000   0.000           0.010          0.016        0.000           0.000         0.000
grade_subgrade        0.004          0.634         0.011                 0.000            0.006              0.000   1.000           0.206          0.000        0.192           0.004         0.000
interest_rate         -0.006         -0.551        0.007                 0.000            0.000              0.010   0.206           1.000          -0.009       0.109           0.008         0.000
loan_amount           0.002          0.008         -0.008                0.000            0.008              0.016   0.000           -0.009         1.000        0.000           0.000         0.008
loan_paid_back        0.029          0.198         0.220                 0.018            0.584              0.000   0.192           0.109          0.000        1.000           0.021         0.000
loan_purpose          0.000          0.000         0.005                 0.010            0.000              0.000   0.004           0.008          0.000        0.021           1.000         0.011
marital_status        0.006          0.004         0.000                 0.004            0.000              0.000   0.000           0.000          0.008        0.000           0.011         1.000
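
The High correlation alerts listed in the report's Alerts section can be cross-checked against the Training Data matrix. The sketch below scans the upper triangle for pairs whose absolute value reaches a threshold. The threshold of 0.5 is an assumption made here for illustration; this report does not state the value it used.

```python
# Training Data correlation matrix, copied from the table above.
names = [
    "annual_income", "credit_score", "debt_to_income_ratio", "education_level",
    "employment_status", "gender", "grade_subgrade", "interest_rate",
    "loan_amount", "loan_paid_back", "loan_purpose", "marital_status",
]
matrix = [
    [1.000, 0.004, 0.005, 0.008, 0.009, 0.004, 0.007, -0.003, -0.009, 0.020, 0.007, 0.010],
    [0.004, 1.000, -0.060, 0.007, 0.051, 0.008, 0.638, -0.517, -0.008, 0.232, 0.008, 0.011],
    [0.005, -0.060, 1.000, 0.006, 0.088, 0.004, 0.024, 0.026, -0.012, 0.334, 0.006, 0.004],
    [0.008, 0.007, 0.006, 1.000, 0.012, 0.004, 0.013, 0.008, 0.005, 0.025, 0.011, 0.008],
    [0.009, 0.051, 0.088, 0.012, 1.000, 0.003, 0.052, 0.025, 0.010, 0.657, 0.015, 0.006],
    [0.004, 0.008, 0.004, 0.004, 0.003, 1.000, 0.009, 0.004, 0.010, 0.007, 0.007, 0.002],
    [0.007, 0.638, 0.024, 0.013, 0.052, 0.009, 1.000, 0.192, 0.013, 0.228, 0.008, 0.013],
    [-0.003, -0.517, 0.026, 0.008, 0.025, 0.004, 0.192, 1.000, -0.001, 0.129, 0.006, 0.006],
    [-0.009, -0.008, -0.012, 0.005, 0.010, 0.010, 0.013, -0.001, 1.000, 0.013, 0.008, 0.008],
    [0.020, 0.232, 0.334, 0.025, 0.657, 0.007, 0.228, 0.129, 0.013, 1.000, 0.025, 0.001],
    [0.007, 0.008, 0.006, 0.011, 0.015, 0.007, 0.008, 0.006, 0.008, 0.025, 1.000, 0.010],
    [0.010, 0.011, 0.004, 0.008, 0.006, 0.002, 0.013, 0.006, 0.008, 0.001, 0.010, 1.000],
]

THRESHOLD = 0.5  # assumed alert threshold, not stated in the report

def high_pairs(names, matrix, threshold):
    """List (a, b, value) for each pair above the diagonal with |value| >= threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(matrix[i][j]) >= threshold:
                pairs.append((names[i], names[j], matrix[i][j]))
    return pairs

high = high_pairs(names, matrix, THRESHOLD)
for a, b, value in high:
    print(f"{a} ~ {b}: {value:+.3f}")
```

Under that assumption the scan returns credit_score with grade_subgrade, credit_score with interest_rate, and employment_status with loan_paid_back, which are the same pairs named in the alerts.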

Missing values

Training Data

[Figure: A simple visualization of nullity by column.]

Original Data

[Figure: A simple visualization of nullity by column.]

Training Data

[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Original Data

[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

Training Data

index   annual_income  debt_to_income_ratio  credit_score  loan_amount  interest_rate  gender  marital_status  education_level  employment_status  loan_purpose        grade_subgrade  loan_paid_back
0       29,367.990     0.084                 736           2,528.420    13.670         Female  Single          High School      Self-employed      Other               C3              1.000
1       22,108.020     0.166                 636           4,593.100    12.920         Male    Married         Master's         Employed           Debt consolidation  D3              0.000
2       49,566.200     0.097                 694           17,005.150   9.760          Male    Single          High School      Employed           Debt consolidation  C5              1.000
3       46,858.250     0.065                 533           4,682.480    16.100         Female  Single          High School      Employed           Debt consolidation  F1              1.000
4       25,496.700     0.053                 665           12,184.430   10.210         Male    Married         High School      Employed           Other               D1              1.000
5       44,940.300     0.058                 653           12,159.920   12.240         Male    Single          Bachelor's       Employed           Other               D1              1.000
6       61,574.160     0.042                 696           16,907.710   13.520         Other   Single          High School      Self-employed      Debt consolidation  C5              1.000
7       45,953.310     0.100                 654           10,111.620   12.820         Female  Married         High School      Employed           Home                D1              1.000
8       30,592.290     0.132                 713           7,522.360    9.480          Male    Married         Bachelor's       Employed           Education           C5              1.000
9       17,342.450     0.121                 548           9,653.480    16.040         Female  Married         Bachelor's       Self-employed      Vacation            F1              1.000

Original Data

index   annual_income  debt_to_income_ratio  credit_score  loan_amount  interest_rate  gender  marital_status  education_level  employment_status  loan_purpose        grade_subgrade  loan_paid_back
0       24,240.190     0.074                 743           17,173.720   13.390         Male    Married         Master's         Employed           Car                 B5              1
1       20,172.980     0.219                 531           22,663.890   17.810         Female  Married         Bachelor's       Employed           Debt consolidation  F1              1
2       26,181.800     0.234                 779           3,631.360    9.530          Female  Single          High School      Employed           Business            B4              1
3       11,873.840     0.264                 809           14,939.230   7.990          Female  Single          High School      Employed           Other               A5              1
4       25,326.440     0.260                 663           16,551.710   15.200         Other   Single          Other            Employed           Car                 D5              1
5       55,559.800     0.081                 774           12,724.020   12.730         Male    Single          High School      Employed           Debt consolidation  B1              1
6       24,642.880     0.165                 742           5,905.270    12.480         Male    Single          Bachelor's       Unemployed         Car                 B3              0
7       52,610.690     0.135                 810           15,136.350   8.450          Female  Single          Bachelor's       Employed           Debt consolidation  A3              1
8       62,922.050     0.074                 724           500.000      9.950          Other   Single          High School      Employed           Home                C4              1
9       53,439.890     0.375                 796           14,712.380   11.810         Female  Single          High School      Employed           Car                 B1              0

Training Data

index   annual_income  debt_to_income_ratio  credit_score  loan_amount  interest_rate  gender  marital_status  education_level  employment_status  loan_purpose        grade_subgrade  loan_paid_back
593984  36,169.340     0.091                 676           9,986.830    14.180         Female  Married         Bachelor's       Retired            Debt consolidation  C3              1.000
593985  37,188.430     0.170                 718           17,056.520   10.470         Female  Married         Bachelor's       Employed           Home                C3              1.000
593986  25,015.350     0.074                 633           15,922.610   13.910         Male    Married         Bachelor's       Employed           Debt consolidation  D2              0.000
593987  17,662.680     0.074                 679           19,792.920   15.480         Female  Single          Other            Employed           Debt consolidation  C3              1.000
593988  15,602.220     0.056                 622           25,706.470   15.750         Female  Married         High School      Employed           Debt consolidation  D2              1.000
593989  23,004.260     0.152                 703           20,958.370   10.920         Female  Single          High School      Employed           Business            C3              1.000
593990  35,289.430     0.105                 559           3,257.240    14.620         Male    Single          Bachelor's       Employed           Debt consolidation  F5              1.000
593991  47,112.640     0.072                 675           929.270      14.130         Female  Married         Bachelor's       Employed           Debt consolidation  C1              1.000
593992  76,748.440     0.067                 740           16,290.400   9.870          Male    Single          Bachelor's       Employed           Debt consolidation  B2              1.000
593993  48,959.520     0.096                 752           7,707.730    10.310         Male    Married         High School      Employed           Education           B3              1.000

Original Data

index   annual_income  debt_to_income_ratio  credit_score  loan_amount  interest_rate  gender  marital_status  education_level  employment_status  loan_purpose        grade_subgrade  loan_paid_back
19990   33,789.750     0.049                 641           14,301.950   11.310         Female  Divorced        High School      Employed           Home                D5              1
19991   20,111.610     0.067                 740           8,861.860    8.300          Male    Married         High School      Employed           Home                B4              1
19992   37,651.770     0.278                 664           11,494.360   12.000         Male    Single          High School      Employed           Other               D3              1
19993   24,082.300     0.178                 674           6,877.770    13.820         Male    Married         PhD              Retired            Debt consolidation  C3              1
19994   44,960.330     0.176                 792           13,300.710   11.990         Female  Married         Bachelor's       Self-employed      Medical             B4              1
19995   39,640.080     0.275                 691           16,322.230   15.050         Female  Married         Bachelor's       Employed           Debt consolidation  C5              0
19996   32,062.900     0.367                 758           16,697.340   11.890         Female  Married         Bachelor's       Employed           Debt consolidation  B5              1
19997   18,642.020     0.106                 751           23,924.780   10.060         Female  Single          Master's         Student            Debt consolidation  B4              1
19998   22,181.390     0.275                 646           16,920.130   16.060         Male    Married         Master's         Retired            Other               D2              1
19999   23,737.700     0.228                 630           15,769.750   13.070         Female  Married         Other            Employed           Business            D2              0

Duplicate rows

Training Data

Dataset does not contain duplicate rows.

Original Data

Dataset does not contain duplicate rows.
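
A duplicate check of this kind can be reproduced outside the report. The sketch below is one possible approach using only the standard library, assuming each row is available as a tuple; with pandas, `DataFrame.duplicated()` serves the same purpose. The helper name `duplicate_rows` and the miniature dataset are hypothetical and not taken from the report.

```python
from collections import Counter

def duplicate_rows(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Hypothetical miniature dataset; the third row repeats the first on purpose.
rows = [
    (29367.99, 0.084, 736, "C3"),
    (22108.02, 0.166, 636, "D3"),
    (29367.99, 0.084, 736, "C3"),
]

print(duplicate_rows(rows))      # one duplicated row, seen twice
print(duplicate_rows(rows[:2]))  # empty: no duplicates
```

An empty result corresponds to the report's message that the dataset does not contain duplicate rows.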